(57) Abstract: A method is presented for controlling a process to be applied to a patterned structure in a production run. Reference
data is provided being representative of diffraction signatures corresponding to a group of different fields in a structure similar to
the patterned structure in the production line (PL). The group of different fields is characterized by different process parameters
used in the manufacture of these fields. The method utilizes an expert system (14) trained to be responsive to input data
representative of a diffraction signature to provide output data representative of corresponding effective parameters of the process.
Optical measurements are applied on the patterned structure in the production line to obtain diffraction signatures thereof and generate
corresponding measured data. The expert system analyses the measured data to determine effective parameters of the process applied
to the patterned structure in the production line.

Process Control for Micro-Lithography



FIELD OF THE INVENTION 

This invention is generally in the field of process control techniques, and 
relates to a method and system for controlling a process of manufacturing patterned 
structures, such as photolithography and etching processes. 

BACKGROUND OF THE INVENTION 

The currently common methods for process control in photolithography,
particularly micro-lithography, are based on the use of CD-SEM. The latter is a
stand-alone tool, which performs measurements of critical dimensions (minimal
lateral dimensions of a pattern) for creating Statistical Process Control (SPC) trend
charts for further monitoring thereof. One of these methods involves creating a
so-called "Focus-Exposure Matrix" (FEM), produced by varying the focus and exposure
(energy) parameters of the lithography from field to field within the wafer, thereby
producing a two-dimensional array of fields spanning a range of these parameters.
By determining CD in each of the FEM fields, optimal values of the focus and
exposure, as well as their allowed tolerance (process window), are determined for
each specific process.

Recently, tools based on scatterometry have been developed, which provide
for higher accuracy and repeatability, faster measurement, smaller volume and lower
cost, as compared to CD-SEM tools. Such scatterometry-based tools are disclosed,
for example, in US Patents Nos. 5,867,276 and 5,963,329; and in the following
publication: "Specular Spectroscopic Scatterometry in DUV Lithography", Xinhui
Niu et al, SPIE Vol. 3677, SPIE Conference on Metrology, Inspection and Process
Control for Microlithography XIII, pp. 159 - 168.

Scatterometry is a method by which the optical signature (spectral response)
of a periodic structure is measured. The signature can be obtained by measuring the
optical properties of a structure (reflectance or ellipsometric parameters) as a function
of one or more light parameters, e.g., the angle of incidence, polarization or
wavelength. Due to the periodicity of the structure, it is possible to theoretically
calculate the signature of a given sample using exact models thereof (e.g., utilizing a
Rigorous Coupled Wave Theory (RCWT)). Processing is thus performed by
correlating the measured signature to theoretically calculated signatures, while fitting
the structure's parameters. This fitting method suffers from such drawbacks as long
calculation time, inadequacy for real-time calculations, and the need for detailed
knowledge about the structure (e.g., optical constants) that is required as input to the
model. The problem of long calculation time is usually overcome by preparing a
library of pre-calculated signatures. This procedure, however, requires a long setup
time. The detailed knowledge about the structure, in many cases, also requires
preliminary setup processes, such as material characterization. Additionally, the
measurement is limited to periodic structures that do not usually exist within the die,
thus requiring fabricating special test structures and correlating the measurements on
these test structures to measurements taken within the die. Yet another problem is the
complicated, sometimes indirect relation between the process parameters (e.g., focus
and exposure) and the profile parameters, which makes it difficult to control the
process by modifying process parameters based on profile information. These
problems impede the application of scatterometry-based systems as a production
tool, specifically for integrated monitoring that requires fast feedback for process
control.

According to another technique, disclosed in the article "PhiScatterometry
for On-line Process Control", N. Benesch et al, AEC/APC Symposium XII, Lake
Tahoe, Nevada, USA, September 23-28, 2000, the signatures measured in different
fields of a Focus-Exposure Matrix are classified using a neural network (NN) into
those found within the control limits and those found outside of them. In other
words, this technique provides only "pass/fail" information, which allows a Process
Alarm to be operated. However, no quantitative information is provided, and therefore
feedback to the process (adjusting the working parameters of the processing tool in a
closed loop control) is impossible.

SUMMARY OF THE INVENTION 

There is accordingly a need in the art to facilitate the control of a process of 
manufacturing patterned structures, particularly micro-lithography, by providing a 
novel control method and system. The present invention introduces a methodology 
that starts with identifying those major process parameters whose variation affects the 
process results. This new methodology also directly exploits the dependence of the 
measured signature on the process parameters, without requiring any model having 
predictive capabilities with regard to the way this dependence is manifested. 

The invention is particularly useful for controlling a lithography process,
wherein focus and exposure are among the dominant factors affecting the 
lithographical profile (critical dimensions, wall angle, etc.). These parameters are 
usually considered in order to control the lithography process and keep the resulting 
profile within the required control limits. The new methodology bypasses the main 
limitations inherent in conventional scatterometry as presented above. 

In the description below, the following terms are used:

The term "parametric matrix" or in short "matrix" used herein signifies a set
of patterned structures (wafers) and/or fields created within the structure(s), that were
fabricated using different values of one or more working parameters of the process to
be controlled. Consequently, the term "matrix field" or in short "field" signifies one
specific part of a parametric matrix, being a wafer or part of a wafer, having a
specific value or set of values of the working parameters. All fields are supposed to
include equivalent measurement sites, not necessarily in the same locations.

The term "measurement site" or in short "site" refers to a specific location
found within each matrix field where the signature measurement is actually being
taken.

The term "signature" signifies an optical response of the structure to
predetermined incident light. Such a signature may be measured as a diffraction of
light interacting with the structure as a function of a light parameter such as
wavelength (spectrum), angle of incidence, ellipsometry, etc. The term "signature"
refers to the total optical information that can be attained from a certain field,
including several measurements taken at different measurement conditions, different
measurement tools and/or at different measurement sites within the same field.

The term "reference tool results" signifies the results of one or more
measurements applied to the parametric matrix or a part thereof by reference tools
other than the measurement apparatus of the present invention.

The term "reference data" refers to all data available in order to perform the
setup process (training of the NN), including mainly but not only signatures
measured on a group of matrix fields and reference tool results from corresponding
fields, as well as the processing conditions of the same fields and any other sort of
information available on these fields.

The term "control window" or "process window" signifies a range or ranges
of one or more working (process) parameters providing desired process results.

The term "merit function" refers to a function that takes two signatures as
input and returns a single number that is some measure of the "distance"
between the two input signatures.

There is thus provided according to one aspect of the present invention, a method of
controlling a process to be applied to a patterned structure in a production run, the
method comprising the steps of:

(a) providing reference data including data representative of
diffraction signatures corresponding to a group of different
fields in a structure similar to said patterned structure in the
production line, and data representative of a control window
for the process parameters corresponding to a signature
representative of desired process results, said group of
different fields being characterized by different process
parameters used in the manufacture of these fields;

(b) providing an expert system trained to be responsive to input
data representative of a diffraction signature to provide output
data representative of corresponding effective parameters of
the process;

(c) applying optical measurements to at least one site on said
patterned structure in the production line to obtain at least one
diffraction signature of said patterned structure in the
production line and generate data representative thereof;

(d) supplying the generated data to said expert system, which
analyses the data to thereby determine effective parameters of
the process applied to said patterned structure in the
production line; and

(e) analyzing said effective process parameters to determine
deviation thereof from corresponding nominal values to
thereby enable the process control.

The reference data is created during an off-line operational mode (calibration
procedure) consisting of the following. The process to be controlled is applied to
different fields on a test structure (wafer) or to different test structures, utilizing
different values of one or more working parameters of the process, thereby preparing
a parametric matrix including variations of at least one working parameter. When
dealing with a lithography process, such a parametric matrix is typically an F-E
matrix (FEM). The FEM is printed by using the same or similar mask as that used in
the production run, varying the exposure along one axis and the focus along the other
axis of a two-dimensional field array. Then, measurements are applied to the test
wafer(s) in order to determine the signatures corresponding to a group of different
fields.

Optionally, additional measurements are applied to the same fields using
reference tools (e.g. CD-SEM, Cross Sectional SEM, AFM) and their results are
added to the reference data. For example, when dealing with a lithography process,
CD-SEM values measured on the same F-E matrix fields may be used as reference
tool results. The setup process is finalized by using the entire reference data as a
training set for training an expert system, e.g., an artificial neural network (NN). In
this stage, the NN is trained to return the process parameters and any available
reference tool results upon receiving signatures as input. Once the NN is properly
trained to do this with the training set data, i.e., with the reference data, it will usually
also be able to find the correct process parameters when inputting a new signature.

During an on-line operational mode (real-time measurements during a
production run), optical signatures are measured at different sites on the real
structure, and supplied as input to the NN. The NN then outputs the effective
process parameters corresponding to each signature. Closed loop process control can
then be performed using the deviations of the effective process parameters from their
corresponding nominal values. For example, in the lithography case, if measured
signatures consistently show that the effective exposure is lower than the nominal
one, correction may be applied to the process by means of increasing the exposure
value. This process control method thus overcomes one of the problems of
conventional scatterometry: it eliminates the need to explicitly translate a change in
profile parameters into a required change in process parameters, since the required
change in process parameter is a direct output of the method of the present invention.
In fact, with respect to the present example of the lithography process, even if the
source of the profile change is other than the exposure, the profile may still be
accurately corrected by changing the exposure. The nature of the closed-loop control
may include different methods such as feedback, feed-forward, etc.
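
The correction step described above can be sketched as follows. This is only a minimal illustration, assuming a simple proportional feedback rule; the function name `exposure_correction`, the damping factor `gain` and all numeric values are hypothetical and are not taken from the described method.

```python
from statistics import mean

def exposure_correction(effective, nominal, gain=0.5):
    """Return the adjustment to add to the tool's exposure setting.

    effective: effective exposure values reported for several measured sites
    nominal:   nominal (target) exposure value
    gain:      hypothetical damping factor, 0 < gain <= 1 (an assumption)
    """
    deviation = mean(effective) - nominal   # negative if exposure reads low
    return -gain * deviation                # push the setting the other way

# Effective exposure consistently below a nominal 20.0 (arbitrary units).
delta = exposure_correction([19.2, 19.4, 19.3], nominal=20.0, gain=0.5)
print(round(delta, 3))
```

In this example the effective exposure averages 19.3 against a nominal 20.0, so the sketch suggests raising the exposure setting by about 0.35 units. A feed-forward scheme would use the same deviation but apply it to a later process step instead.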

Preferably, the method of the present invention also includes re-training of the
NN. As described above, a set of fields is used in order to prepare the reference data
by spanning the process window in several selected process parameters. In the
lithography example, the use of a single wafer allows spanning the process window
both in exposure and focus. However, there are many other process parameters that
are assumed to be constant and are not actively varied throughout the selected set of
fields, so-called "hidden parameters". In the lithography example, when using a
single FEM wafer, the film thicknesses of all the stack films are hidden parameters.

The values of hidden parameters may later on be changed, thereby affecting
the diffraction signature in a manner that was not taken into consideration when
training the NN. This introduces errors into the NN results. One way to eliminate
the effect of hidden parameters is to "unhide" them, i.e., to expand the matrix so as to
include the hidden parameters as active parameters of the matrix. For example,
photoresist film thickness may become a matrix parameter by producing several
FEM wafers having deliberately different film thicknesses. By applying
measurements to all wafers and including all the measurement results into the
reference data used to train the NN, the effect of the variable film thickness may be
taken into account in the same way as focus and exposure parameters.

There are two different ways to treat the expanded matrix. According to one
embodiment of the method, a new parameter (photoresist film thickness in the above
example) should be a control parameter to be treated on an equal footing with the other
parameters. To this end, the value of the new parameter must be known at each field
and added to the reference data. In the lithography example, each wafer has to be
measured for its film thickness and the NN has to be trained to output F, E and film
thickness values. However, it is also possible to train the NN with all the field
signatures, regardless of which wafer they come from, without requiring the NN to
output the new matrix parameter(s) (film thickness). In this case, the NN will learn
how to provide the correct values for the existing matrix parameters (F and E)
regardless of the values of the new matrix parameter(s) (film thickness). Since their
exact values are not required, the values for the new parameters (film thickness) do
not need to be measured in this case. Therefore, in this case, the effect of the
expanded matrix consists of "immunizing" the NN from possible errors coming from
variations in the new parameter(s) (film thickness). Such additional parameters that
are sampled but not controlled will be referred to as "sampled parameters", in contrast
to the "control parameters".

It should be noted that expanding the matrix means preparing and measuring
additional wafers, resulting in an increase in setup time and effort. If, for example,
there are n sampled parameters and each of them needs to be sampled m times across
its corresponding allowed range, then the expanded matrix size will be m^n times
the size of the original matrix, which includes only the control parameters, making the
expanded matrix impractical in many cases. To partly overcome this problem, a
"sparsely expanded matrix" can be used in which the sampled parameters are not
fully sampled for all possible cases, but rather sampled sparsely. If, for example,
there are two sampled parameters, and each needs in principle to be sampled 5 times,
then instead of multiplying the matrix size by 5^2 = 25, a sparse sampling of as few as
5-9 cases may supply most of the required information in order to immunize the
system against variations in these two parameters.
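
The size argument above can be illustrated with a short sketch. It assumes two sampled parameters with five levels each and picks the sparse cases at random; the parameter names, the level values and the choice of random selection are all hypothetical, and a designed sparse scheme could be used instead.

```python
import itertools
import random

def full_expansion(levels_per_parameter):
    """All combinations of sampled-parameter levels (m**n for n parameters, m levels each)."""
    return list(itertools.product(*levels_per_parameter))

def sparse_expansion(levels_per_parameter, k, seed=0):
    """Pick only k of the combinations, chosen at random (an assumed strategy)."""
    combos = full_expansion(levels_per_parameter)
    return random.Random(seed).sample(combos, k)

thickness = [480, 490, 500, 510, 520]   # hypothetical resist thickness levels, nm
bake_temp = [108, 109, 110, 111, 112]   # hypothetical second sampled parameter
full = full_expansion([thickness, bake_temp])
sparse = sparse_expansion([thickness, bake_temp], k=7)
print(len(full), len(sparse))
```

The full expansion multiplies the number of FEM wafers by 25, while the sparse version needs only 7 wafers in this example. Random selection does not guarantee that every level of every parameter is visited, which is one reason a hand-designed sparse layout may be preferred.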

An effective way to immunize the system against possible errors is to use the
naturally occurring distribution of the hidden parameters. By randomly choosing a
group of fields with varied production conditions (e.g., coming from different wafers,
different lots, different tools or different times), the reference set may naturally be
populated with fields having the correct distribution in all hidden parameters. Such a
training set may allow the NN immunization to take place without actively sampling
a large number of hidden parameters (which may even be unknown) and without the
need to produce and measure a large number of fields.

One specifically attractive way to sample the naturally occurring distribution
is to utilize wafers that are produced anyway for regular periodical tests. For
example, FEM wafers are routinely produced (in many cases, daily) in order to verify
the stability of the production process and follow variations. Thus, by adding each
time (day) additional information about new wafers, it is possible to improve the
immunity of the system with time.

Additionally, the method of the present invention may include immunizing the
NN to possible machine errors that are common in different measurement tools. If,
for example, the measurement tool may have gain and offset errors, it is possible to
immunize the system to such errors by simulating them. In this case, after measuring a
set of signatures for a set of fields, machine errors are simulated by artificially adding
bias and gain factors to the signatures. Each signature may thus be duplicated several
times, applying to different duplicates different amounts of gain and offset, thus
adding further sampled parameters to the matrix.
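
The duplication of signatures with simulated tool errors can be sketched as below. This is an illustrative assumption of how the gain and offset might be applied (a multiplicative gain and an additive offset on every spectral point); the function name and the error magnitudes are hypothetical.

```python
def augment_with_tool_errors(signature, gains, offsets):
    """Return copies of one signature distorted by simulated gain and offset errors."""
    return [
        [g * value + o for value in signature]
        for g in gains
        for o in offsets
    ]

measured = [0.30, 0.42, 0.55, 0.47]   # hypothetical reflectance spectrum
copies = augment_with_tool_errors(
    measured,
    gains=[0.98, 1.00, 1.02],         # assumed +/-2 % gain error
    offsets=[-0.01, 0.00, 0.01],      # assumed +/-0.01 offset error
)
# Every copy keeps the process parameters (F, E) of the original field as its label.
print(len(copies))
```

Each measured field then contributes nine training signatures instead of one, all carrying the same focus and exposure labels, so the trained system sees the tool errors as variations it should ignore.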

A set of signatures measured on the parametric matrix can be used as a 
signature library (look-up-table), which may be part of the reference data. According 
to this embodiment, during production run, every measured signature is compared to 
the signatures stored in the signature library, while searching for the signature that is 
the closest to the measured one. The search is carried out using a "merit function" 
that measures the level of fitting between any two signatures. 
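
A library search of this kind can be sketched as follows. The text does not fix the form of the merit function, so this example assumes a root-mean-square difference between signatures sampled on the same wavelength grid; the library contents and function names are hypothetical.

```python
import math

def merit(sig_a, sig_b):
    """Assumed merit function: RMS difference of two equally sampled signatures."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)) / len(sig_a))

def closest_field(library, measured):
    """Return the (parameters, signature) library entry with the lowest merit value."""
    return min(library, key=lambda entry: merit(entry[1], measured))

# Hypothetical library entries: ((focus, exposure), signature)
library = [
    ((-0.2, 18.0), [0.31, 0.40, 0.52]),
    ((0.0, 20.0), [0.35, 0.45, 0.58]),
    ((0.2, 22.0), [0.39, 0.51, 0.66]),
]
params, _ = closest_field(library, [0.34, 0.46, 0.57])
print(params)
```

The search returns the process parameters of the best-matching field, here the entry at focus 0.0 and exposure 20.0. The result is limited to the discrete parameter values present in the library, unlike the continuous output of a trained NN.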

The signature library can also be used for verification when using an NN for 
the fitting process. Verification is needed, since the NN-based method, as described 
above, has no internal way of measuring how successful the interpretation is, i.e., 
what reliability the user should attribute to the results. Verification may be done in 
one of the following ways: 

1. Searching the signature library and comparing the NN results to the search 
results: if the results are sufficiently close in the control parameters space, then the 
NN result is assigned a high reliability score; 

2. Looking in the library for the signature whose "coordinates" in the control 
parameters space are the closest ones to the NN result; and comparing this library
signature with the newly measured signature: if the merit function between the 
library signature and the new signature is sufficiently low, the NN result gets a high 
reliability score. 
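
The second verification scheme can be sketched as follows. The RMS merit function, the Euclidean distance in (focus, exposure) space, the threshold `merit_limit` and the library contents are all assumptions made for illustration only.

```python
import math

def merit(sig_a, sig_b):
    """Assumed merit function: RMS difference of two equally sampled signatures."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)) / len(sig_a))

def nn_result_is_reliable(library, nn_params, measured, merit_limit):
    """Check an NN result against the library entry nearest to it in parameter space."""
    # Nearest library entry to the NN output in (focus, exposure) space.
    # In practice the axes would need scaling to comparable units.
    _, nearest_sig = min(library, key=lambda entry: math.dist(entry[0], nn_params))
    # A low merit value means the measured signature resembles that entry.
    return merit(nearest_sig, measured) <= merit_limit

# Hypothetical library entries: ((focus, exposure), signature)
library = [
    ((-0.2, 18.0), [0.31, 0.40, 0.52]),
    ((0.0, 20.0), [0.35, 0.45, 0.58]),
    ((0.2, 22.0), [0.39, 0.51, 0.66]),
]
ok = nn_result_is_reliable(library, (0.02, 20.3), [0.34, 0.46, 0.57], merit_limit=0.02)
bad = nn_result_is_reliable(library, (0.02, 20.3), [0.50, 0.60, 0.70], merit_limit=0.02)
print(ok, bad)
```

In the first call the measured signature closely matches the library signature nearest to the NN result, so the result would be given a high reliability score. In the second call the signatures differ strongly, which flags the NN output as suspect.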

The data collection and analysis technique of the present invention makes it possible to
overcome most of the inherent disadvantages of conventional scatterometry, namely:

- The setup of the present invention is simpler and requires neither deep
understanding of the application nor long calculations. The setup may therefore be
used for any structure, regardless of the complexity of underlying layers or the
structure of the profiles, without the need to find the optical characteristics of the
materials involved. It may also be easily used by operators having minimal training.



WO 02/37186 



PCT/BL01/01007 




- The technique of the present invention can be applied to any measurement
site, and not only to line gratings. Among the possible sites may be, for example,
hole arrays, memory cells or any other diffracting structure. The only desired
conditions are as follows: (1) the measured signal should not strongly depend on the
exact measurement location, within the positioning accuracy of the system, and (2)
light diffraction should be sufficiently strong to make the optical signal sensitive to
the process parameters.

- Real time measurement is very fast, regardless of the application complexity.

- Closed loop control is directly available, as the output is already given in
terms of the control parameters.

It should also be noted that, in distinction to the prior art classification method
utilizing a learning system, the method of the present invention allows quantitative
process control, while the classification technique allows solely a process alarm. This
is due to the fact that the output obtained with the invented method is continuous,
while the only output obtainable with the prior art technique is indicative of whether
the profile is within the process window or not. Additionally, the method of the
present invention has several mechanisms allowing it to be immune to different
naturally-occurring variations that may affect the interpretation, making it more
robust.

According to another aspect of the present invention, there is provided a
production line for manufacturing patterned structures in a production run
comprising:

(One) a processing tools arrangement characterized by certain values of its
working parameters;
(Two) an optical measurement apparatus operable to apply a measurement to
the structure and detect a diffraction signature indicative of light response of
the structure to incident light, said diffraction signature varying with a
change in at least one of said working parameters; and
(Three) an expert system trained to be responsive to input data
representative of a diffraction signature to provide output data representative
of corresponding effective value of said at least one working parameter,
thereby enabling analysis of said effective value to determine deviation
thereof from a corresponding nominal value and allow control of said
process.

BRIEF DESCRIPTION OF THE DRAWINGS 

In order to understand the invention and to see how it may be carried out in 
practice, a preferred embodiment will now be described, by way of non-limiting 
example only, with reference to the accompanying drawings, in which: 

Fig. 1 is a schematic illustration of a part of production line in the 
manufacture of semiconductor devices utilizing the present invention; 

Fig. 2 schematically illustrates a FEM wafer prepared for the purposes of the 
present invention; 

Fig. 3a illustrates different signatures corresponding to a process utilizing
different exposure values and the same focus value;

Fig. 3b illustrates different signatures corresponding to a process utilizing
different focus values and the same exposure value;

Figs. 4 and 5 graphically illustrate simulation results of the method of the 
present invention; 

Figs. 6a and 6b illustrate the main steps in the method according to the 
invention. 

DETAILED DESCRIPTION OF THE INVENTION 

More specifically, the present invention is used for controlling a
photolithography process used in the manufacture of semiconductor devices, and is
therefore described below with respect to this application. It should, however, be
understood that the method of the present invention can be used for controlling any
other process of the kind wherein variation of the process parameters affects the
diffraction signature of a patterned structure being processed.

Referring to Fig. 1, there is illustrated a part of production line PL utilizing
the present invention. In the present example, the production line part PL is
composed of a photolithography tools (FT) arrangement 10, a measuring apparatus
12 and a control unit 14. The construction of the photolithography tools arrangement
does not form a part of the present invention, and therefore need not be specifically
described, except to note its main constructional parts such as wafer
loading-unloading, coating, exposure and developing tools, and a robot. The measuring
apparatus is designed to be installable within the photolithography tools.

The measuring apparatus 12 presents an optical measuring system capable of
measuring optical signatures (diffraction signatures) from sufficiently small areas of a
wafer, and may utilize any known tool used in scatterometry measurements. The
control unit 14 is a computer device installed with a programming utility, which is the
so-called "expert system" (preferably, neural network) containing signal processing
and computational intelligence for decision-making. Such a neural network (NN)
utilizes a logic utility based on decision tables and a learning mode of operation for
periodically updating the decision tables, in view of measurement and analysis
results.

Optionally, a reference measurement tool 16 is used, which is an offline tool
operable to apply measurements to the same sites measured by the measuring
apparatus 12 or to other sites within the same fields and provide structure information
from these sites. Such a reference tool may be a measurement unit of any known
type using non-optical techniques, such as SEM or AFM, or optical techniques, such
as a scatterometry tool.

The basic setup procedure carried out by the measurement system 12 will
now be described.

The setup procedure of the measurement system 12 includes preparation of
reference data, and the training of an expert system. The preparation of reference
data includes the preparation of a parametric matrix and the measurements with the
system 12, which consist of the following:

The parametric matrix is prepared by applying a set of production processes
to the test wafers, to thereby process different wafers or different wafer fields using
different working parameters. In the present example, the processing tool to be
controlled is the exposure tool, and the varying working parameters are focus F and
exposure (energy) E.

As shown in Fig. 2, the test wafer TW includes different fields, generally
designated Fi, presenting the distribution of different focus-energy values, i.e., a
focus-energy matrix (FEM). Preferably, this distribution is carried out in a randomly
scattered manner on the wafer, and not along rows and columns as used in the
conventional technique. This is associated with the fact that some
non-homogeneity in the wafer's parameters, such as underlying layer thickness, is
position-dependent (e.g., depends on the radial position on the wafer). By randomly
scattering the FEM fields, a situation can be achieved such that for each set of close
thickness values, several fields of the FEM are obtained which represent different
parts of the matrix. Otherwise (i.e., in the conventional technique), a correlation
might occur between FEM parameters and thickness values, which makes it more
difficult for the expert system to generalize the measurements to different thickness
values.
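
A randomly scattered FEM layout of this kind can be sketched as follows. The grid of field positions, the focus and exposure values and the function name are hypothetical; the sketch only shows the idea of decoupling the (F, E) setting from the field position.

```python
import itertools
import random

def scattered_fem_layout(focus_values, exposure_values, field_positions, seed=None):
    """Assign every (focus, exposure) pair to a randomly chosen wafer field."""
    settings = list(itertools.product(focus_values, exposure_values))
    if len(settings) > len(field_positions):
        raise ValueError("not enough fields on the wafer for the full matrix")
    # Draw distinct positions at random so that neighbouring fields do not
    # follow the row/column ordering of a conventional FEM.
    positions = random.Random(seed).sample(field_positions, len(settings))
    return dict(zip(positions, settings))

focus = [-0.2, -0.1, 0.0, 0.1, 0.2]          # hypothetical focus offsets, um
exposure = [18.0, 19.0, 20.0, 21.0, 22.0]    # hypothetical doses, mJ/cm^2
fields = [(col, row) for col in range(6) for row in range(6)]  # 36 field slots
layout = scattered_fem_layout(focus, exposure, fields, seed=1)
print(len(layout))
```

Each of the 25 focus-exposure combinations is printed exactly once, but at a position unrelated to its values, so a radially varying underlayer thickness is less likely to correlate with either focus or exposure.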

Since the present example provides a possibility to span the two control
parameters on a single wafer, the use of a single test wafer may be sufficient for the
initial training of the expert system (NN). Optionally, it is possible to expand the
parametric matrix by producing additional test wafers, similar to the first one, having
one or more of the "hidden parameters" (e.g., the photoresist film thickness)
deliberately changed from wafer to wafer. The field distribution in the additional test
wafers may be similar to or different from the distribution on the first test wafer. The
distribution of the additional sampled parameters should represent the naturally
occurring distribution of these parameters, or at least should have a similar range.
The additional parameters are not necessarily fully and orderly sampled in the matrix,
but may be sparsely or even randomly sampled. This extension of the parametric
matrix enables "immunization" of the NN to variations in these parameters that
otherwise would have resulted in significant errors.

Once the parametric matrix is ready, all or part of the fields are measured by
applying the measuring apparatus (12 in Fig. 1) to the test fields. The measured
diffraction signatures are stored, along with the production conditions of the
measured fields (i.e., parameters of the process). Figs. 3a and 3b illustrate the
spectrum variations for, respectively, nominal focus value and variable exposure, and
nominal exposure value and variable focus.

Optionally, the reference measurement tools (16 in Fig. 1) is applied to the 
same test fields, and measurement results are added to the reference data. For 
example, if CD-SEM is used as the reference tool, then all FEM fields for which 
optical (diflraction) signatures were measured are also measured by the CD-SEM, 
and CD values obtained with this tool are stored for each field, along with the 
process conditions of that field and the optical signatures measured in that field. The 
bulk of all measurements, both of the measurement apparatus (12) and of the 
reference tools (16) stored in correlation with each field's production parameters, is 
generally referred to as the "reference data" or "signature horary''. It should be noted 
that the test wafer (fields) may be measured by the reference tool at the same state as 
measured by the measuring apparatus 12, or at another state, e.g., after additional 
processes have been applied to the wafers or before some process were applied 
thereto. For example, whereas the measuring apparatus 12 may be measuring the test 
fields after the photoresist development stage, the reference tool measurements may 
be taken after the subsequent etch and clean stages. Alternatively, the reference 
measurements may be electrical measurements carried out on other structures in the 
same fields, at the end of toe line. This possibility allows the system to correlate the 
signature measured in one (preparatory) stage with the outcome of a later stage. 
Using the reference tool allows for defining a process window, e.g., a range of
allowed CD values. Correspondence between the control parameters (F-E) and the
reference tool results (CD) yields a range of allowed control parameter sets (F-E),
which can be used later on during the production run. Based on the reference tool
data, the nominal values for the control parameters (F-E) can also be determined as a
set of parameters resulting in the best structure (profile).

The expert system training consists of the following: 

The expert system (preferably NN) receives a signature (in the extended 
sense, as defined above) as input data, and generates output including data indicative 
of the control parameters, which are those parameters of the parametric matrix that 
are used for the control of the process (F-E), and optionally also data indicative of the 
reference tool results (CD). The expert system is now trained using the reference 
data as a "training set" so as to be capable of providing correct output data (within
certain allowed errors) upon receiving the signature of any field. This training is
carried out using the same data, while changing the parameters of the NN, until an
acceptable error is reached for all fields. Care has to be taken to avoid "over-doing"
of the training process, since that will reduce the NN's ability to generalize (as
described below). 
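
By way of non-limiting illustration only, the training stage described above may be sketched in Python as follows. The sketch assumes a small one-hidden-layer network and a held-out validation subset that is used to stop the training before it is over-done. The function name train_with_early_stopping, the network size, the learning rate and the synthetic signatures are hypothetical and are not taken from the described system.

```python
import numpy as np

def train_with_early_stopping(X, Y, hidden=8, lr=0.05, max_epochs=2000,
                              patience=50, val_fraction=0.25, seed=0):
    """Fit signature -> control parameters; stop when the validation error stalls."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_val = max(1, int(len(X) * val_fraction))
    val, trn = idx[:n_val], idx[n_val:]
    params = [rng.normal(0, 0.5, (X.shape[1], hidden)), np.zeros(hidden),
              rng.normal(0, 0.5, (hidden, Y.shape[1])), np.zeros(Y.shape[1])]

    def forward(A, p):
        H = np.tanh(A @ p[0] + p[1])          # hidden layer
        return H, H @ p[2] + p[3]             # linear output layer

    best_err, best_params, stale = np.inf, [p.copy() for p in params], 0
    for _ in range(max_epochs):
        H, out = forward(X[trn], params)
        d_out = 2.0 * (out - Y[trn]) / len(trn)        # gradient of squared error
        d_hid = (d_out @ params[2].T) * (1.0 - H ** 2)
        grads = (X[trn].T @ d_hid, d_hid.sum(0), H.T @ d_out, d_out.sum(0))
        for p, g in zip(params, grads):
            p -= lr * g                                 # plain gradient descent step
        val_err = float(np.mean((forward(X[val], params)[1] - Y[val]) ** 2))
        if val_err < best_err - 1e-9:
            best_err, best_params, stale = val_err, [p.copy() for p in params], 0
        else:
            stale += 1
            if stale >= patience:                       # avoid over-training
                break
    return (lambda A: forward(np.atleast_2d(A), best_params)[1]), best_err

# Hypothetical toy data: 36 fields of a 6x6 focus-exposure grid, 20-point signatures.
F, E = np.meshgrid(np.linspace(-1, 1, 6), np.linspace(-1, 1, 6))
Y = np.column_stack([F.ravel(), E.ravel()])
wl = np.linspace(0, 1, 20)
X = np.outer(Y[:, 0], np.sin(3 * wl)) + np.outer(Y[:, 1], np.cos(2 * wl))
predict, val_err = train_with_early_stopping(X, Y)
```

The network weights giving the lowest validation error are retained, which is one possible way of limiting the loss of generalization referred to above.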

During a production run, measurements and process control are performed in 
the following manner: 

Production wafers are measured in one or more of the sites that were 
measured on the test wafers and measured data indicative of one or more diffraction 
signatures, respectively, is generated. The measured signature is input to the trained
NN, which thereupon outputs the effective control parameters (F-E) corresponding to 
that signature, and optionally the corresponding effective reference tool results (CD). 

It should be noted that, in this case, the results of the NN present "effective
control parameters" and not "control parameters", as they may or may not be the real
control parameters with which the field was actually produced.

The ability of the NN to provide reliable results for signatures that were not
included in the training set reflects its ability to generalize. The
generalization is expressed as the NN's ability to successfully interpolate and, to
some extent, extrapolate between and beyond the cases that it was trained for.

Then, either a process alarm (PA) or a closed-loop process control (CLC) is
performed based on the above results. In the case of the process alarm, the effective
control parameters (F-E) are compared to the process window, given as a range of
these parameters. If the effective control parameters are outside the process window, 
the PA is operated, e.g., by means of a message sent by communication to the 
production tool or the fab host computer, or by means of an audio-visual sign for the 
operator. 
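
As an illustrative sketch only, the process alarm comparison described above may be expressed in Python as follows. The class ProcessWindow, the function process_alarm and the numerical limits are hypothetical names and values chosen for the example; the signed deviations are returned together with the alarm flag so that the direction of the deviation remains available.

```python
from dataclasses import dataclass

@dataclass
class ProcessWindow:
    """Allowed range of the control parameters (hypothetical units)."""
    focus_min: float
    focus_max: float
    exposure_min: float
    exposure_max: float

def process_alarm(focus, exposure, window):
    """Return (alarm, deviations); a deviation is 0.0 inside the window,
    otherwise the signed distance beyond the violated limit."""
    def excess(value, low, high):
        if value > high:
            return value - high
        if value < low:
            return value - low
        return 0.0
    deviations = {
        "focus": excess(focus, window.focus_min, window.focus_max),
        "exposure": excess(exposure, window.exposure_min, window.exposure_max),
    }
    return any(d != 0.0 for d in deviations.values()), deviations

# Hypothetical window and effective parameters output by the NN.
window = ProcessWindow(-0.1, 0.1, 18.0, 22.0)
alarm, deviations = process_alarm(0.0, 23.0, window)
```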

It should be understood that, in distinction to the method described in the 
above-indicated article of Benesch et al., according to which the NN is trained to
directly provide a result indicating whether the measured signature (field) is within or
out of the process window without giving any indication of the direction of the
deviation, the method of the present invention provides for applying PA only after
quantitative results for the process parameters have been obtained. 

In the case of process control, the effective control parameters found by the 
NN for each measured field are compared to the nominal control parameter values. 
The direction and magnitude of the deviations from nominal values (optimal profile) 
can then be used for correcting the process parameters' values used in the production 
of subsequent wafers, such that the deviations are reduced in subsequent wafers 
(feedback). Alternatively, the same deviations may be used in the subsequent
production steps applied to the same wafer in order for the final result to be closer to
the nominal one (feed-forward process control). The algorithm by which such
feedback/feed-forward may be realized contains the following main features: a 
filtering mechanism reducing sporadic atypically large deviations and reducing the 
effect of random noise; a tendency to reduce constant errors to zero; and a 
mechanism by which the algorithm calibrates the effect of its own actions on the 
stability of the system. 
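
A minimal Python sketch of a feedback algorithm having the first two of the above features is given below, purely as an assumption-laden example. It uses clipping and exponentially weighted averaging as the filtering mechanism and an accumulated (integral) correction to drive constant errors to zero; the self-calibration mechanism is not implemented. The class RunToRunController, its gains and the simulated constant offset are hypothetical.

```python
class RunToRunController:
    """Hypothetical run-to-run feedback on one control parameter."""

    def __init__(self, nominal, gain=0.5, smoothing=0.3, clip=3.0):
        self.nominal = nominal
        self.gain = gain            # fraction of the filtered deviation corrected per run
        self.smoothing = smoothing  # weight of the newest deviation in the average
        self.clip = clip            # limit applied to atypically large deviations
        self.filtered_dev = 0.0
        self.correction = 0.0

    def update(self, effective_value):
        dev = effective_value - self.nominal
        dev = max(-self.clip, min(self.clip, dev))                       # reject outliers
        self.filtered_dev += self.smoothing * (dev - self.filtered_dev)  # reduce noise
        self.correction -= self.gain * self.filtered_dev                 # integral action
        return self.correction

# Simulated process with a constant hidden offset of +1.0 in effective exposure.
ctrl = RunToRunController(nominal=20.0)
offset, correction = 1.0, 0.0
for _ in range(60):
    effective = 20.0 + offset + correction   # value the NN would report for this run
    correction = ctrl.update(effective)      # correction applied to the next run
```

In this simulation the accumulated correction settles near the negative of the offset, so that the effective value returns to nominal.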

It should be noted that the invented method as described above is capable of 
solving problems that may not necessarily originate from fluctuations of the control 
parameters, but rather from changes of hidden parameters. If, for example, CDs are
consistently too large due to some change in the post-exposure bake or in the
development step, changing the exposure may bring the CDs back to their nominal
values, compensating for the errors in other steps by deliberately changing the
exposure control parameters. The recommended direction of such changes is
immediately identified from the effective control parameters assigned to the
measured signatures. If, for example, the effective exposure is consistently higher
than the nominal one, it actually means that by reducing exposure below its current 
value (nominal or not), the CDs may be brought closer to nominal. 

The present invention additionally provides for the validation of the expert 
system results and for the re-training of the expert system over time.

In order to validate the result of the NN, several methods may be used. One
such validation method consists of searching through the signature library stored
during the setup procedure for the library signature that best matches the newly
measured signature. The matching quality is defined by a merit function.
Interpolation between library signatures may then be applied in order to improve the
final result. The control parameters of the best-matched signature may then be taken
as an estimate for the signature's effective control parameters. The search results may
either be used by themselves for the purpose of process control (i.e., as the final
results), or facilitate validation of the reliability of the NN's result. In the latter case,
the effective control parameters, as determined by the NN, are compared to those
determined by the search method, and, if the difference between the two results is
below a predetermined threshold, the NN results are considered reliable.
Alternatively, the field whose parameters are the closest ones to the NN's results may
be found, and its corresponding signature, as stored in the signature library, may be
compared with the newly measured signature using a merit function. If such a
comparison yields a merit function lower than a predefined value, the result of the
NN is considered valid and used for process control. Yet another alternative method
for validation consists of utilizing additional, un-patterned sites. Such sites may be
measured on the test wafers and stored as an additional part of the reference data.
Once a newly measured signature is found to be outside the process window, an
equivalent un-patterned site may be measured on the currently measured wafer.
Upon comparing the signature of the newly measured un-patterned site with its
counterpart stored in the reference data, the system can determine whether the
observed deviation is likely to be due to the lithographical process or due to
variations unaccounted for in previous production steps.
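
The first validation method described above (library search followed by comparison with the NN result) may be sketched in Python as follows. This is an illustration under stated assumptions: the merit function is taken here to be a root-mean-square difference, which is only one possible choice, and the function names, library contents and tolerances are hypothetical.

```python
import numpy as np

def merit(sig_a, sig_b):
    """Root-mean-square difference between two signatures (assumed merit function)."""
    return float(np.sqrt(np.mean((np.asarray(sig_a) - np.asarray(sig_b)) ** 2)))

def best_library_match(measured, library_sigs, library_params):
    """Return the control parameters and merit value of the best-matching library entry."""
    scores = [merit(measured, s) for s in library_sigs]
    i = int(np.argmin(scores))
    return library_params[i], scores[i]

def nn_result_is_reliable(nn_params, measured, library_sigs, library_params, tol):
    """Accept the NN result if it agrees with the search result within tol per parameter."""
    matched_params, _ = best_library_match(measured, library_sigs, library_params)
    diff = np.abs(np.asarray(nn_params) - np.asarray(matched_params))
    return bool(np.all(diff <= np.asarray(tol)))

# Hypothetical three-entry library: signatures and their (focus, exposure) values.
library_sigs = np.array([[0.1, 0.2, 0.3], [0.2, 0.3, 0.4], [0.4, 0.5, 0.6]])
library_params = np.array([[-0.1, 18.0], [0.0, 20.0], [0.1, 22.0]])
measured = [0.21, 0.29, 0.41]
matched, score = best_library_match(measured, library_sigs, library_params)
```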

The retraining procedure consists of the following:

In many cases, F-E matrix wafers are routinely produced in order to calibrate 
processing tools and check the stability of the process. Adding additional wafers' 
data to the matrix serves such important purposes as increasing the amount of data 
available (thus reducing random errors), and extending the range of the matrix in 
parameters that are changing in time. Since some parameters change with time very
slowly, the only way to sample them (i.e., find wafers having different values of these
hidden parameters) is by sampling them over time. Once the NN is re-trained using
the additional wafers, it will be immune to these parameters as they change. Thus,
re-training provides for continuously improving the NN's immunity to
the hidden parameter effect.

The following is an example (simulation) showing how the method of the 
present invention can be used in a simple case proving the capabilities of the method. 
The structures used in this simulation contain trapezoidal SiO2 line profiles on top of
two un-patterned layers ("underlayers") on a Si substrate. In order to simulate an F-E
matrix, a 6x6 matrix was prepared with CD changing along one axis of the matrix
(simulating changes in exposure), and wall-angle changing along the other axis of the
matrix (simulating changes in focus). The film thicknesses of the underlying layers
were treated as sampled parameters in the following manner. The range of each film
thickness was defined as approx. +/-10% from the respective nominal value. Each
parameter was sampled 5 times within this range, thereby creating a 5x5 matrix. The
full parametric matrix now included 5x5x6x6 fields, as for each combination of
underlayer thicknesses the entire "F-E" matrix (CD-angle in the simulation) was
created. The measurement tool used for the simulation was a broadband
reflectometer measuring the reflectance of the sample as a function of wavelength.
Spectra were calculated using a physical model to simulate the measured signatures
for all the fields of the parametric matrix. Part or all of the signatures were then used
for the training set (as described below), the required output parameters being CD
and wall angle (as simulators of Exposure and Focus). In order to test the ability of
the NN, once trained, to generalize and give satisfactory results also for "new"
signatures, a second "test set" of signatures was prepared. The test set included a
similar matrix, which this time was computed for CD/angle values being half-way
between the values of the training set. This test set thus represented the worst case in
terms of the net's capability to interpolate. Once the NN had been trained, the test set
was supplied to the NN as input data and the differences between the NN's results for
the test set and the real values (with which the test set signatures had been calculated)
were registered. Henceforward, the average of these differences over the whole test
set was used as the measure of the NN's success in generalizing and producing correct
results. 
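
The layout of the simulated parametric matrix and of the half-way test set may be illustrated by the following Python sketch. Only the matrix sizes (5x5x6x6), the approx. +/-10% thickness range and the 25nm nominal thickness of one underlayer are taken from the example; the CD and wall-angle ranges, the nominal thickness of the second underlayer and all names are hypothetical, and the physical model that computes the spectra is not reproduced.

```python
import itertools
import numpy as np

def midpoints(values):
    """Values half-way between consecutive grid points (worst case for interpolation)."""
    v = np.asarray(values, dtype=float)
    return (v[:-1] + v[1:]) / 2.0

# Hypothetical parameter grids.
cd_values = np.linspace(90.0, 110.0, 6)            # simulates exposure, nm
angle_values = np.linspace(80.0, 90.0, 6)          # simulates focus, degrees
thickness_1 = np.linspace(0.9 * 25.0, 1.1 * 25.0, 5)     # +/-10% around 25 nm
thickness_2 = np.linspace(0.9 * 100.0, 1.1 * 100.0, 5)   # assumed nominal of 100 nm

# Full parametric matrix: every thickness combination carries a full CD-angle matrix.
training_fields = list(itertools.product(thickness_1, thickness_2,
                                         cd_values, angle_values))

# Test set: CD/angle values half-way between those of the training grid.
test_fields = list(itertools.product(midpoints(cd_values), midpoints(angle_values)))

def mean_abs_error(predicted, true):
    """Average absolute difference used as the measure of generalization."""
    return float(np.mean(np.abs(np.asarray(predicted) - np.asarray(true))))
```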

Two different cases were tested: with and without sampling of the
underlayers. In the first case, only the nominal values of the underlayers were used
in order to train the NN. Test sets with varying thickness values of one of the
underlayers were then tested on the NN. The resulting mean error in the CD as a
function of the underlayer thickness is shown in Fig. 4. As can be seen, the mean
CD error for the nominal thickness value (25nm in this example) is well below 1nm.
However, this error strongly increases with the change in the underlayer thickness.
In the second case, FEM wafers with three different values of underlayer thickness
(24nm, 25nm and 26nm) were used for training the NN. As can be seen in Fig. 5, the
net provided correct output values of CD (with errors well below 1nm) not only for
the 3 cases it had been trained for, but also for larger and smaller values of the
underlayer thickness for which it had not been trained, showing very good
extrapolation capabilities. 

In another case, a set of 5 "FEM" wafers covering the center and the corners 
of the 5x5 thickness matrix were taken for the training set. In this case, the test set
included all fields from the full 2D range of thickness values, and again, the results
show the ability of the NN to generalize and create effective interpolation and
extrapolation.

It should be noted that the above simulations did not include the use of
reference tool data, which is optional for the purposes of the present invention.

Thus, the trained expert system is capable of analyzing a measured optical 
signature taken as a real-time measurement during a production run, so as to carry out 
at least one of the following: select corresponding control parameters (F-E) values 
from the existing signature library (search); determine whether the real-time 
measured data is within the control window or not (classification); and estimate 
effective control parameter (F-E) values corresponding to the real-time measured 
signature (interpolation/extrapolation). These analyses' results, each one or as a 
combined set, may thereafter be utilized for the control of the production process. 

Figs. 6a and 6b illustrate the flow diagram of the main operational steps in a 
method according to the present invention. 

As shown in Fig. 6a, the following steps are carried out off-line, i.e., on the 
test wafer (FEM): (1) reference data is prepared by applying signature measurements 
to the previously developed test wafer TW (e.g., by the scatterometer 12), (2) process 
window PW is determined on the test wafer after being etched (e.g., by CD-SEM), 
and (3) the expert system is trained.

As shown in Fig. 6b, during the on-line operational mode of the system, the
signatures (measured data) are obtained on a real wafer and processed in the expert
system to determine whether the measured data (F-E values) are within the process
window or not. If the measured data are outside the process window, the corrected
values of the working parameters are determined to be used for tuning the processing
tool prior to applying it to subsequent real wafers.

Immunity may be achieved by training the learning system not only using the
measured signals from the FEM, but also from transformations thereof. Such
transformations may be produced by modifying the measured signals according to
predetermined rules, e.g., by applying different gain and offset factors, or by adding
random noise. Alternatively, the transformations can be used only as a validation
step that determines at what stage the training process should stop, in order to avoid
over-fitting. 
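
The transformations mentioned above may, for example, be generated as in the following Python sketch, which is given as an assumption only. Each measured signature is duplicated with a random gain factor, a random offset and additive random noise, while its target parameters are left unchanged. The function augment_signatures and the spreads used are hypothetical, and the targets are assumed to be given as a two-dimensional array with one row per field.

```python
import numpy as np

def augment_signatures(signatures, targets, n_copies=3, gain_spread=0.02,
                       offset_spread=0.01, noise_sigma=0.005, seed=0):
    """Return the original signatures plus n_copies transformed duplicates of each."""
    rng = np.random.default_rng(seed)
    signatures = np.asarray(signatures, dtype=float)
    targets = np.asarray(targets, dtype=float)
    aug_x, aug_y = [signatures], [targets]
    for _ in range(n_copies):
        # One gain and one offset per signature, independent noise per spectral point.
        gain = 1.0 + rng.uniform(-gain_spread, gain_spread, (len(signatures), 1))
        offset = rng.uniform(-offset_spread, offset_spread, (len(signatures), 1))
        noise = rng.normal(0.0, noise_sigma, signatures.shape)
        aug_x.append(gain * signatures + offset + noise)
        aug_y.append(targets)          # process parameters are not changed
    return np.vstack(aug_x), np.vstack(aug_y)

# Hypothetical data: 4 fields, 10 spectral points, 2 target parameters (F, E).
X = np.ones((4, 10))
Y = np.zeros((4, 2))
X_aug, Y_aug = augment_signatures(X, Y)
```

The transformed duplicates may either be added to the training set or kept aside as a validation set for deciding when to stop the training.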

The same may be achieved by selectively choosing those parts of the
signature, e.g., specific spectral ranges, which present higher sensitivity to the F-E
parameters, as compared to other parts of the signature. To this end, the variability
(e.g., standard deviation) for each part of the spectrum (e.g., each spectral point) over
the entire F-E matrix can be calculated. Those parts that are not sensitive or are
relatively less sensitive will be disregarded, in order to minimize the effect of other
parameters on the NN results. This technique may be further refined using the
variability function vs. the signature's free parameter (e.g., wavelength) as a
characteristic according to which the NN's sensitivity is tuned. Such tuning may be
performed by adding to the training set duplicates of real measurements that are
somewhat distorted by adding random noise, such that those parts of the signature that
are more sensitive to the F-E parameters are least distorted and those that are less
sensitive to F-E are most distorted. Using these additional signatures to train the NN,
the net's sensitivity to those parts of the spectrum that have been distorted is reduced,
achieving the same result.
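
The selection and tuning steps described above may be sketched in Python as follows, as a hedged illustration only. The variability is computed as the standard deviation of each spectral point over the F-E matrix; the fraction of points kept, the linear mapping from variability to noise level and all function names are assumptions made for the example.

```python
import numpy as np

def spectral_variability(fem_signatures):
    """Standard deviation of each spectral point over all fields of the F-E matrix."""
    return np.std(np.asarray(fem_signatures, dtype=float), axis=0)

def sensitive_mask(variability, keep_fraction=0.5):
    """Boolean mask keeping the most variable (most F-E sensitive) spectral points."""
    k = max(1, int(round(keep_fraction * len(variability))))
    threshold = np.sort(variability)[-k]
    return variability >= threshold

def sensitivity_weighted_noise(signatures, variability, max_sigma=0.01, seed=0):
    """Distort duplicates so that the least sensitive points receive the most noise."""
    rng = np.random.default_rng(seed)
    v = np.asarray(variability, dtype=float)
    span = v.max() - v.min()
    weight = (v.max() - v) / span if span > 0 else np.zeros_like(v)
    sigma = max_sigma * weight        # zero noise at the most sensitive point
    signatures = np.asarray(signatures, dtype=float)
    return signatures + rng.normal(0.0, 1.0, signatures.shape) * sigma

# Hypothetical F-E matrix of 3 fields with 4 spectral points each.
sigs = np.array([[1.0, 1.0, 1.0, 1.0],
                 [1.0, 1.1, 1.5, 3.0],
                 [1.0, 1.2, 2.0, 5.0]])
variability = spectral_variability(sigs)
mask = sensitive_mask(variability)
distorted = sensitivity_weighted_noise(sigs, variability)
```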

The NN can be trained using data that comes from several different FEMs 
(wafers). Assuming that the number and choice of wafers is sufficiently large to
represent the variability of other parameters, such as layer thickness, the NN can
generalize the result from the given examples and reduce dramatically the effect of
variations in these parameters on F-E results. Indeed, when using several wafers, it is
possible to achieve good training even without using the entire FEM on each wafer.
Thus, if sufficient (random) variability of the perturbing parameters exists on the
same wafer, it is possible to get the same immunity to variations in these parameters
using a single wafer. 

As can be seen from the example described above with reference to Figs. 4 
and 5, the errors are minimal for the center value of the layer thickness that 
corresponds to the training set (e.g., 25nm-wafer). However, errors increase
dramatically once the layer thickness changes in either direction from that of the
training set. Using the training set including three "wafers" - 24nm, 25nm and
26nm, the error is reduced dramatically not only for the training values, but also for
other values of layer thickness which lie outside the training set. Hence, it is evident
that the net generalizes the information allowing extrapolation (as well as
interpolation, which is easier) to other layer thickness values. 

As indicated above, immunity may also be gained by distributing the FEM 
fields in a "randomly scattered" manner on the wafer. Generally speaking, the 
immunity can be optimized by utilizing various "hidden parameters" in the learning 
mode of the NN, such as layer thicknesses, gain, offset, etc.

Those skilled in the art will readily appreciate that various modifications and 
changes can be applied to the preferred embodiments of the invention as hereinbefore 
exemplified without departing from its scope defined in and by the appended claims. 
CLAIMS:

1. A method of controlling a process to be applied to a patterned structure in a 
production run, the method comprising the steps of:

(a) providing reference data including data representative of 
diffraction signatures corresponding to a group of different 
fields in a structure similar to said patterned structure in the 
production line, and data representative of a control window 
for the process parameters corresponding to a signature 
representative of desired process results, said group of 
different fields being characterized by different process 
parameters used in the manufacture of these fields; 

(b) providing an expert system trained to be responsive to input 
data representative of a diffraction signature to provide output 
data representative of corresponding effective parameters of 
the process; 

(c) applying optical measurements to at least one site on said 
patterned structure in the production line to obtain at least one 
diffraction signature of said patterned structure in the 
production line and generate data representative thereof; 

(d) supplying the generated data to said expert system, which 
analyses the data to thereby determine effective parameters of 
the process applied to said patterned structure in the 
production line; and 

(e) analyzing said effective process parameters to determine
deviation thereof from corresponding nominal values to
thereby enable the process control.

2. The method according to Claim 1, wherein said providing of the reference data 
comprises the steps of preparing a parametric matrix including variations of at least 
one process parameter within a group of different test fields; and applying
measurements to said different test fields to determine corresponding signatures. 

3. The method according to Claim 2, wherein the preparation of the parametric 
matrix comprises the step of applying the process to be controlled to the different test
fields utilizing different values of one or more process parameters.

4. The method according to Claim 3, wherein said different values of one or more 
process parameters are randomly distributed in said different test fields. 

5. The method according to Claim 3, wherein said different fields are also 
characterized by different thickness values of underlying layers in the patterned
structure.

6. The method according to Claim 3, wherein the process is applied to the different 
fields in the same test structure. 

7. The method according to Claim 3, wherein the process is applied to the different 
fields in different test structures. 

8. The method according to Claim 5, wherein the process is applied to the different
fields in different test structures. 

9. The method according to Claim 1, wherein the process to be controlled is a 
lithography process, the process parameters including at least one of exposure energy 
and focus of incident light. 

10. The method according to Claim 1, wherein said expert system is an artificial
neural network. 

11. The method according to Claim 10, wherein said training of the expert system is 
carried out utilizing the reference data as a training set.

12. The method according to Claim 10, and also comprising the step of retraining the 
neural network with time by repeating steps (a) and (b) utilizing additional test fields.

13. The method according to Claim 1, wherein said analyzing to determine the 
effective process parameters comprises fitting of said generated data and the 
reference data. 

14. The method according to Claim 1, and also comprising the step of applying a
reference tool to said test fields to thereby take additional measurements and generate
data representative of reference tool results indicative of a control window
corresponding to the desired process results to be included in the reference data. 

15. The method according to Claim 10, and also comprising the step of verifying the 
effective process parameters resulting from the neural network analysis to ensure a
high reliability score of the neural network results.

16. The method according to Claim 15, wherein said verifying comprises searching in 
a signature library including results of a plurality of measurements applied to the 
structure and indicative of process parameters corresponding to various signatures. 

17. The method according to Claim 16, wherein said signature library is included in 
said reference data.

18. The method according to Claim 16, wherein the neural network results are 
compared to the search results to ensure that the neural network results are 
sufficiently close to said control window. 

19. The method according to Claim 16, wherein said searching is aimed at finding the 
diffraction signature whose corresponding process parameters within said control
window are the closest ones to those of the neural network results; said verifying
comprising comparing the found reference signature with the signature measured in
the patterned structure in the production line to determine whether a merit function
between said found signature and said measured signature is sufficiently low, being
thereby indicative of the high reliability score of the neural network results.

20. A method of controlling a lithography process to be applied to a wafer in a 
production run, the method comprising the steps of: 

(a) providing reference data including data representative of 
diffraction signatures corresponding to a group of different 
fields in a wafer similar to said wafer in the production line,
and data representative of a control window for at least one of
exposure and focus parameters of the process corresponding
to a signature representative of desired lithography results,
said group of different fields being characterized by different
values of at least one of said process parameters used in the
manufacture of these fields; 

(b) providing an expert system trained to be responsive to input 
data representative of a diffraction signature to provide output 
data representative of corresponding effective values of said 
process parameters; 

(c) applying optical measurements to at least one site on said 
wafer in the production line to obtain at least one diffraction 
signature of said wafer and generate data representative 
thereof; 

(d) supplying the generated data to said expert system, which 
analyses the data to thereby determine effective values of said 
process parameters applied to said wafer in the production 
line; and 

(e) analyzing said effective process parameters to determine
deviation thereof from corresponding nominal values to
thereby enable the lithography process control.

21. A production line for manufacturing patterned structures in a production run
comprising: 

(a) a processing tools arrangement characterized by certain values of
its working parameters; and

(b) an optical measurement apparatus operable to apply a measurement to
the structure and detect a diffraction signature indicative of light response of
the structure to incident light, said diffraction signature varying with a
change in at least one of said working parameters; and

(c) an expert system trained to be responsive to input data representative of a
diffraction signature to provide output data representative of corresponding
effective value of said at least one working parameter, thereby enabling
analysis of said effective value to determine deviation thereof from a
corresponding nominal value and allow control of said process.